Abstract

The paper deals with asymptotic properties of the adaptive procedure proposed by the authors in 2007 for estimating an unknown nonparametric regression. We prove that this procedure is asymptotically efficient for a quadratic risk, i.e. the asymptotic quadratic risk for this procedure coincides with the Pinsker constant, which gives a sharp lower bound for the quadratic risk over all possible estimators.
1 Introduction

The paper deals with the estimation problem in the heteroscedastic nonparametric regression model

$$y_j = S(x_j) + \sigma_j(S)\,\xi_j , \quad 1 \le j \le n, \tag{1.1}$$

where the design points are $x_j = j/n$, $S(\cdot)$ is an unknown function to be estimated, $(\xi_j)_{1\le j\le n}$ is a sequence of centered independent random variables with unit variance and $(\sigma_j(S))_{1\le j\le n}$ are unknown scale functionals depending on the design points and the regression function $S$.

Typically, the notion of asymptotic optimality is associated with the optimal convergence rate of the minimax risk (see e.g., Ibragimov, Hasminskii, 1981; Stone, 1982). An important question in optimality results is to study the exact asymptotic behavior of the minimax risk. Such results have been obtained only in a limited number of investigations. As to the nonparametric estimation problem for heteroscedastic regression models, we should mention the papers by Efromovich, 2007, Efromovich, Pinsker, 1996, and Galtchouk, Pergamenshchikov, 2005, concerning the exact asymptotic behavior of the $\mathcal{L}_2$-risk, and the paper by Brua, 2007, devoted to efficient pointwise estimation for heteroscedastic regressions.

Heteroscedastic regression models are widely used in financial mathematics, in particular in calibration problems (see e.g., Belomestny, Reiss, 2006). An example of heteroscedastic regression models is given by econometrics (see, for example, Goldfeld, Quandt, 1972, p. 83), where for consumer budget problems one uses some parametric version of model (1.1) with the scale coefficients defined as

$$\sigma_j^2(S) = c_0 + c_1 x_j + c_2 S^2(x_j) , \tag{1.2}$$

where $c_0$, $c_1$ and $c_2$ are some unknown positive constants.
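To make the model concrete, the following minimal sketch simulates observations from (1.1) with the scale coefficients (1.2). The function name `simulate_model`, the default constants and the Gaussian choice for the noise are illustrative assumptions, not part of the paper; the model itself only requires centered independent noise with unit variance.

```python
import math
import random

def simulate_model(S, n, c0=1.0, c1=0.5, c2=0.5, seed=0):
    """Draw (x_j, y_j, sigma_j), j = 1..n, from y_j = S(x_j) + sigma_j(S) * xi_j.

    Assumptions (illustrative only): x_j = j/n, Gaussian noise xi_j, and
    sigma_j^2(S) = c0 + c1 * x_j + c2 * S(x_j)^2 as in (1.2).
    """
    rng = random.Random(seed)
    xs, ys, sigmas = [], [], []
    for j in range(1, n + 1):
        x = j / n
        s = S(x)
        sigma = math.sqrt(c0 + c1 * x + c2 * s * s)  # scale depends on x and on S
        xi = rng.gauss(0.0, 1.0)                     # centered, unit variance
        xs.append(x)
        ys.append(s + sigma * xi)
        sigmas.append(sigma)
    return xs, ys, sigmas

# Usage: a 1-periodic regression function observed at n = 11 design points.
xs, ys, sigmas = simulate_model(lambda x: math.sin(2 * math.pi * x), 11)
```

Note that the noise level at each design point depends on the unknown function itself, which is what distinguishes this setting from the homogeneous case.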

The purpose of the article is to study asymptotic properties of the adaptive estimation procedure proposed in Galtchouk, Pergamenshchikov, 2007, for which a non-asymptotic oracle inequality was proved for quadratic risks. We will prove that this oracle inequality is asymptotically sharp, i.e. the asymptotic quadratic risk is minimal. It means that the adaptive estimation procedure is efficient under some conditions on the scales $(\sigma_j(S))_{1\le j\le n}$ which are satisfied in the case (1.2). Note that in Efromovich, 2007, Efromovich, Pinsker, 1996, an efficient adaptive procedure is constructed for heteroscedastic regression when the scale coefficient is independent of $S$, i.e. $\sigma_j(S) = \sigma_j$. In Galtchouk, Pergamenshchikov, 2005, for the model (1.1) the asymptotic efficiency was proved under strong conditions on the scales which are not satisfied in the case (1.2). Moreover, in the cited papers the efficiency was proved for Gaussian random variables $(\xi_j)_{1\le j\le n}$, which is very restrictive for applications of the proposed methods to practical problems.

In this paper we modify the risk. We take an additional supremum over the family of unknown noise distributions, as in Galtchouk, Pergamenshchikov, 2006. This modification allows us to eliminate from the risk the dependence on the noise distribution. Moreover, for this risk an efficient procedure is robust with respect to changes of the noise distribution.

It is well known that to prove asymptotic efficiency one has to show that the asymptotic quadratic risk coincides with the lower bound, which is equal to the Pinsker constant. In the paper two problems are resolved: in the first one an upper bound for the risk is obtained by making use of the non-asymptotic oracle inequality from Galtchouk, Pergamenshchikov, 2007; in the second one we prove that this upper bound coincides with the Pinsker constant. Let us recall that the adaptive procedure proposed in Galtchouk, Pergamenshchikov, 2007, is based on weighted least-squares estimates, where the weights are proper modifications of the Pinsker weights for the homogeneous case (when $\sigma_1(S) = \dots = \sigma_n(S) = 1$) relative to a certain smoothness of the function $S$, and this procedure chooses a best estimator for the quadratic risk among these estimators. To obtain the Pinsker constant for the model (1.1) one has to prove a sharp asymptotic lower bound for the quadratic risk in the case when the noise variance depends on the unknown regression function. As usual, we bound the minimax risk from below by a Bayesian one for a respective parametric family. Then for the Bayesian risk we make use of a lower bound (see Theorem 6.1) which is a modification of the van Trees inequality (see Gill, Levit, 1995).

The paper is organized as follows. In Section 2 we construct an adaptive estimation procedure. In Section 3 we formulate the principal conditions. The main results are presented in Section 4. The upper bound for the quadratic risk is given in Section 5. In Section 6 we give all main steps of proving the lower bound. In Subsection 6.1 we find the lower bound for the Bayesian risk which bounds the minimax risk from below. In Subsection 6.2 we study a special parametric family of functions used to define the Bayesian risk. In Subsection 6.3 we choose a prior distribution for the Bayesian risk to maximize the lower bound. Section 7 is devoted to explaining how to use the given procedure in the case when the unknown regression function is nonperiodic. In Section 8 we discuss the main results and their practical importance. The proofs are given in Section 9. The Appendix contains some technical results.



2 Adaptive procedure 

In this section we describe the adaptive procedure proposed in Galtchouk, Pergamenshchikov, 2006. We make use of the standard trigonometric basis $(\phi_j)_{j\ge1}$ in $\mathcal{L}_2[0,1]$, i.e.

$$\phi_1(x) = 1 , \quad \phi_j(x) = \sqrt{2}\,\mathrm{Tr}_j(2\pi[j/2]x) , \quad j \ge 2 , \tag{2.1}$$

where the function $\mathrm{Tr}_j(x) = \cos(x)$ for even $j$ and $\mathrm{Tr}_j(x) = \sin(x)$ for odd $j$; $[x]$ denotes the integer part of $x$.

To evaluate the error of estimation in the model (1.1) we will make use of the empirical norm in the Hilbert space $\mathcal{L}_2[0,1]$, generated by the design points $(x_j)_{1\le j\le n}$ of model (1.1). To this end, for any functions $u$ and $v$ from $\mathcal{L}_2[0,1]$, we define the empirical inner product

$$(u, v)_n = \frac{1}{n}\sum_{l=1}^{n} u(x_l)\,v(x_l) .$$

Moreover, we will use this inner product for vectors in $\mathbb{R}^n$ as well, i.e. if $u = (u_1,\dots,u_n)'$ and $v = (v_1,\dots,v_n)'$, then

$$(u, v)_n = \frac{1}{n}\,u'v = \frac{1}{n}\sum_{l=1}^{n} u_l v_l .$$

The prime denotes the transposition.

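Notice that if $n$ is odd, then the functions $(\phi_j)_{1\le j\le n}$ are orthonormal with respect to this inner product, i.e. for any $1 \le i, j \le n$,

$$(\phi_i, \phi_j)_n = \frac{1}{n}\sum_{l=1}^{n}\phi_i(x_l)\,\phi_j(x_l) = \mathrm{Kr}_{ij} , \tag{2.2}$$

where $\mathrm{Kr}_{ij}$ is Kronecker's symbol, $\mathrm{Kr}_{ij} = 1$ if $i = j$ and $\mathrm{Kr}_{ij} = 0$ for $i \ne j$.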

Remark 2.1. Note that in the case of even $n$, the basis (2.1) is orthogonal and it is orthonormal except for the $n$th function, for which the normalizing constant should be changed. The corresponding modifications of the formulas for even $n$ can be found in Galtchouk, Pergamenshchikov, 2005. To avoid these complications of formulas related to even $n$, we suppose $n$ to be odd.
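The orthonormality property (2.2) and the exception described in Remark 2.1 can be checked numerically. The sketch below is only an illustration: the helper names `phi`, `empirical_inner` and `gram_matrix` are assumptions introduced here, and the code simply evaluates the empirical inner product on the design points $x_l = l/n$.

```python
import math

def phi(j, x):
    """Trigonometric basis (2.1): phi_1 = 1, phi_j = sqrt(2) * Tr_j(2*pi*[j/2]*x)."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    # Tr_j is cos for even j and sin for odd j.
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))

def empirical_inner(u, v, n):
    """Empirical inner product (u, v)_n over the design points x_l = l/n."""
    return sum(u(l / n) * v(l / n) for l in range(1, n + 1)) / n

def gram_matrix(n):
    """Matrix of empirical inner products (phi_i, phi_j)_n for 1 <= i, j <= n."""
    return [[empirical_inner(lambda x, i=i: phi(i, x),
                             lambda x, j=j: phi(j, x), n)
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]

# For odd n the Gram matrix is (numerically) the identity; for even n the
# last diagonal entry differs from 1, which is the exception of Remark 2.1.
G_odd = gram_matrix(7)
G_even = gram_matrix(6)
```

For even $n$ the last basis function reduces to $\sqrt{2}\cos(\pi l)$ on the grid, so its empirical squared norm is 2 rather than 1.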
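Thanks to this basis we pass to the discrete Fourier transformation of model (1.1):

$$\hat\theta_{j,n} = \theta_{j,n} + \frac{1}{\sqrt{n}}\,\xi_{j,n} , \tag{2.3}$$

where $\hat\theta_{j,n} = (Y, \phi_j)_n$, $Y = (y_1,\dots,y_n)'$, $\theta_{j,n} = (S, \phi_j)_n$ and

$$\xi_{j,n} = \frac{1}{\sqrt{n}}\sum_{l=1}^{n}\sigma_l(S)\,\xi_l\,\phi_j(x_l) .$$

We estimate the function $S$ by the weighted least squares estimator

$$\hat S_\lambda = \sum_{j=1}^{n}\lambda(j)\,\hat\theta_{j,n}\,\phi_j , \tag{2.4}$$

where the weight vector $\lambda = (\lambda(1),\dots,\lambda(n))'$ belongs to some finite set $\Lambda$ from $[0,1]^n$ with $n \ge 3$.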
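As an illustration of (2.3) and (2.4), the following sketch computes the empirical Fourier coefficients of a data vector and builds the weighted least squares estimator for a given weight vector. The helper names are assumptions made for this example; the basis function is repeated so that the block is self-contained.

```python
import math

def phi(j, x):
    """Trigonometric basis (2.1)."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))

def fourier_coefficients(y):
    """Empirical coefficients theta_hat_j = (Y, phi_j)_n for j = 1..n."""
    n = len(y)
    return [sum(y[l - 1] * phi(j, l / n) for l in range(1, n + 1)) / n
            for j in range(1, n + 1)]

def weighted_lse(y, lam):
    """Return the function x -> sum_j lam(j) * theta_hat_j * phi_j(x), as in (2.4)."""
    n = len(y)
    theta_hat = fourier_coefficients(y)
    def S_hat(x):
        return sum(lam[j - 1] * theta_hat[j - 1] * phi(j, x)
                   for j in range(1, n + 1))
    return S_hat

# Usage: noiseless data equal to phi_2 on the grid; shrink that coefficient by half.
n = 7
y = [phi(2, l / n) for l in range(1, n + 1)]
S_half = weighted_lse(y, [1.0, 0.5] + [0.0] * (n - 2))
```

With all weights equal to one and odd $n$, the estimator reproduces the data at the design points, so the choice of weights is what controls the amount of smoothing.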

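Here we make use of the weight family $\Lambda$ introduced in Galtchouk, Pergamenshchikov, 2008, i.e.

$$\Lambda = \{\lambda_\alpha ,\ \alpha \in \mathcal{A}_\varepsilon\} , \quad \mathcal{A}_\varepsilon = \{1,\dots,k^*\}\times\{t_1,\dots,t_m\} , \tag{2.5}$$

where $t_i = i\varepsilon$ and $m = [1/\varepsilon^2]$. We suppose that the parameters $k^* \ge 1$ and $0 < \varepsilon \le 1$ are functions of $n$, i.e. $k^* = k^*_n$ and $\varepsilon = \varepsilon_n$, such that

$$\lim_{n\to\infty} k^*_n = +\infty , \quad \lim_{n\to\infty}\frac{k^*_n}{\ln n} = 0 , \quad \lim_{n\to\infty}\varepsilon_n = 0 \quad\text{and}\quad \lim_{n\to\infty} n^{\nu}\varepsilon_n = +\infty , \tag{2.6}$$

for any $\nu > 0$. For example, one can take for $n \ge 3$

$$\varepsilon_n = 1/\ln n \quad\text{and}\quad k^*_n = \overline{k} + \sqrt{\ln n} ,$$

where $\overline{k}$ is any nonnegative constant.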

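For each $\alpha = (\beta, t) \in \mathcal{A}_\varepsilon$ we define the weight vector $\lambda_\alpha = (\lambda_\alpha(1),\dots,\lambda_\alpha(n))'$ as

$$\lambda_\alpha(j) = \mathbf{1}_{\{1\le j\le j_0\}} + \big(1 - (j/\omega(\alpha))^{\beta}\big)\,\mathbf{1}_{\{j_0 < j \le \omega(\alpha)\}} . \tag{2.7}$$

Here $j_0 = j_0(\alpha) = [\omega(\alpha)\,\varepsilon]$ with

$$\omega(\alpha) = \overline{\omega} + (A_\beta\,t)^{1/(2\beta+1)}\,n^{1/(2\beta+1)} , \tag{2.8}$$

where $\overline{\omega}$ is any nonnegative constant and

$$A_\beta = \frac{(\beta+1)(2\beta+1)}{\pi^{2\beta}\,\beta} .$$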
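A short sketch may help to see the shape of the weights (2.7). It rests on assumptions that should be checked against the original typeset formulas: the cutoff is taken as $j_0 = [\omega(\alpha)\varepsilon]$ and the constant as $A_\beta = (\beta+1)(2\beta+1)/(\pi^{2\beta}\beta)$. The function name `weight_vector` and its parameters are hypothetical.

```python
import math

def weight_vector(beta, t, n, eps, omega_bar=0.0):
    """Weights lambda_alpha(j), j = 1..n, for alpha = (beta, t).

    Assumed reading of (2.7)-(2.8): weight 1 up to j0, the taper
    1 - (j/omega)^beta for j0 < j <= omega, and 0 beyond omega.
    """
    A = (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)
    omega = omega_bar + (A * t * n) ** (1.0 / (2 * beta + 1))
    j0 = int(omega * eps)  # integer part
    lam = []
    for j in range(1, n + 1):
        if j <= j0:
            lam.append(1.0)
        elif j <= omega:
            lam.append(1.0 - (j / omega) ** beta)
        else:
            lam.append(0.0)
    return lam

# Usage: beta = 1, t = 4, n = 1001, eps = 0.5 keeps the first few
# coefficients untouched and tapers the remaining low frequencies.
lam = weight_vector(beta=1, t=4.0, n=1001, eps=0.5)
```

The weights are nonincreasing in $j$ and vanish above $\omega(\alpha)$, so only of order $n^{1/(2\beta+1)}$ Fourier coefficients enter the estimator.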

Remark 2.2. Note that the weighted least squares estimators (2.4) were introduced by Pinsker, 1981, for continuous time optimal signal filtering in Gaussian noise. He proved that the mean-square asymptotic risk is minimized by weighted least squares estimators with weights of type (2.7). Moreover, he found the sharp minimal value of the mean-square asymptotic risk, which was later called the Pinsker constant. Nussbaum, 1985, used the same method with a proper modification for efficient estimation of the function $S$ of known smoothness in the homogeneous Gaussian model (1.1), i.e. when $\sigma_1(S) = \dots = \sigma_n(S) = 1$ and $(\xi_j)_{1\le j\le n}$ is an i.i.d. $\mathcal{N}(0,1)$ sequence.

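To choose weights from the set (2.5) we minimize the special cost function introduced by Galtchouk, Pergamenshchikov, 2007. This cost function is as follows

$$J_n(\lambda) = \sum_{j=1}^{n}\lambda^2(j)\,\hat\theta^2_{j,n} - 2\sum_{j=1}^{n}\lambda(j)\,\tilde\theta_{j,n} + \rho\,\hat P_n(\lambda) , \tag{2.9}$$

where

$$\tilde\theta_{j,n} = \hat\theta^2_{j,n} - \frac{1}{n}\,\hat\varsigma_n \quad\text{with}\quad \hat\varsigma_n = \sum_{j=l_n+1}^{n}\hat\theta^2_{j,n} \tag{2.10}$$

and $l_n = [n^{1/3} + 1]$. The penalty term we define as

$$\hat P_n(\lambda) = \frac{|\lambda|^2\,\hat\varsigma_n}{n} , \quad |\lambda|^2 = \sum_{j=1}^{n}\lambda^2(j) \quad\text{and}\quad \rho = \frac{1}{3 + L_n} ,$$

where $L_n \ge 0$ is any slowly increasing sequence, i.e.

$$\lim_{n\to\infty} L_n = +\infty \quad\text{and}\quad \lim_{n\to\infty}\frac{L_n}{n^{\nu}} = 0 , \tag{2.11}$$

for any $\nu > 0$.

Finally, we set

$$\hat\lambda = \mathrm{argmin}_{\lambda\in\Lambda}\,J_n(\lambda) \quad\text{and}\quad \hat S_* = \hat S_{\hat\lambda} . \tag{2.12}$$

The goal of this paper is to study asymptotic (as $n \to \infty$) properties of this estimation procedure.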
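The selection rule (2.12) can be sketched in a few lines once the empirical Fourier coefficients are available. The code below is a hedged illustration, not the authors' implementation: it assumes the reading $\tilde\theta_{j,n} = \hat\theta^2_{j,n} - \hat\varsigma_n/n$, $\hat\varsigma_n = \sum_{j>l_n}\hat\theta^2_{j,n}$, $l_n = [n^{1/3}+1]$ and $\hat P_n(\lambda) = |\lambda|^2\hat\varsigma_n/n$ for the garbled formulas (2.9)-(2.10), and all function names are hypothetical.

```python
def integer_cube_root(n):
    """Largest integer c with c**3 <= n (avoids floating-point rounding)."""
    c = int(round(n ** (1.0 / 3.0)))
    while c ** 3 > n:
        c -= 1
    while (c + 1) ** 3 <= n:
        c += 1
    return c

def cost(lam, theta_hat, rho):
    """Penalized cost J_n(lambda) under the assumed reading of (2.9)-(2.10)."""
    n = len(theta_hat)
    l_n = integer_cube_root(n) + 1                          # l_n = [n^(1/3) + 1]
    varsigma_hat = sum(th * th for th in theta_hat[l_n:])   # sum over j = l_n+1..n
    theta_tilde = [th * th - varsigma_hat / n for th in theta_hat]
    penalty = sum(w * w for w in lam) * varsigma_hat / n
    fit = sum(w * w * th * th for w, th in zip(lam, theta_hat))
    cross = sum(w * tt for w, tt in zip(lam, theta_tilde))
    return fit - 2.0 * cross + rho * penalty

def select_weights(family, theta_hat, rho):
    """Return the weight vector of the finite family that minimizes the cost."""
    return min(family, key=lambda lam: cost(lam, theta_hat, rho))

# Usage on a toy example with n = 3 coefficients and two candidate weight vectors.
theta_hat = [2.0, 0.0, 1.0]
family = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
best = select_weights(family, theta_hat, rho=0.1)
```

Because the family is finite, the minimization is a plain search; in the procedure described above the family would consist of the weight vectors indexed by the grid (2.5).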
Remark 2.3. Now we explain why one chooses the cost function in the form (2.9). Developing the empirical quadratic risk for the estimate (2.4), one obtains

$$\|\hat S_\lambda - S\|^2_n = \sum_{j=1}^{n}\lambda^2(j)\,\hat\theta^2_{j,n} - 2\sum_{j=1}^{n}\lambda(j)\,\hat\theta_{j,n}\,\theta_{j,n} + \|S\|^2_n .$$

It is natural to choose the weight vector $\lambda$ for which this function reaches its minimum. Since the last term on the right-hand side is independent of $\lambda$, it can be dropped, and one has to minimize with respect to $\lambda$ the function equal to the difference of the first two terms on the right-hand side. It is clear that this minimization problem cannot be solved directly because the Fourier coefficients $(\theta_{j,n})$ are unknown. To overcome this difficulty, we replace the product $\hat\theta_{j,n}\theta_{j,n}$ by its asymptotically unbiased estimator $\tilde\theta_{j,n}$ (see Galtchouk, Pergamenshchikov, 2007, 2008). Moreover, to pay for this substitution, we introduce into the cost function the penalty term with a small coefficient $\rho > 0$. The form of the penalty term is provided by the principal term of the quadratic risk for the weighted least-squares estimator, see Galtchouk, Pergamenshchikov, 2007, 2008. The coefficient $\rho > 0$ is small, which means that the penalty is small, because the estimator $\tilde\theta_{j,n}$ approximates in mean the quantity $\hat\theta_{j,n}\theta_{j,n}$ asymptotically, as $n \to \infty$.

Note that the principal difference between the procedure (2.12) and the adaptive procedure proposed by Golubev, Nussbaum, 1993, for a homogeneous Gaussian regression, consists in the presence of the penalty term in the cost function (2.9).

Remark 2.4. As was noted in Remark 2.2, Nussbaum, 1985, has shown that the weight coefficients of type (2.7) provide the asymptotic minimum of the mean-squared risk in the regression function estimation problem for the homogeneous Gaussian model (1.1), when the smoothness of the function $S$ is known. In fact, to obtain an efficient estimator one needs to take a weighted least squares estimator (2.4) with the weight vector $\lambda_\alpha$, where the index $\alpha$ depends on the smoothness of the function $S$ and on the coefficients $(\sigma_j(S))_{1\le j\le n}$ (see (5.2) below), which are unknown in our case. For this reason, Galtchouk, Pergamenshchikov, 2007, 2008, have proposed to make use of the family of coefficients (2.5), which contains the weight vector providing the minimum of the mean-squared risk. Moreover, they proposed the adaptive procedure (2.12), for which a non-asymptotic oracle inequality (see Theorem 4.1 below) was proved under some weak conditions on the coefficients $(\sigma_j(S))_{1\le j\le n}$. It is important to note that, due to the properties of the parametric family (2.6), the secondary term in the oracle inequality is slowly increasing (slower than any power of $n$).



3 Conditions 



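First we impose some conditions on the unknown function $S$ in the model (1.1).

Let $\mathcal{C}^k_{per,1}(\mathbb{R})$ be the set of 1-periodic $k$ times differentiable $\mathbb{R} \to \mathbb{R}$ functions. We assume that $S$ belongs to the following set

$$W^k_r = \Big\{ f \in \mathcal{C}^k_{per,1}(\mathbb{R}) \,:\, \sum_{j=0}^{k}\|f^{(j)}\|^2 \le r \Big\} , \tag{3.1}$$

where $\|\cdot\|$ denotes the norm in $\mathcal{L}_2[0,1]$, i.e.

$$\|f\|^2 = \int_0^1 f^2(t)\,dt . \tag{3.2}$$

Moreover, we suppose that $r > 0$ and $k \ge 1$ are unknown parameters. Note that we can represent the set $W^k_r$ as an ellipse in $\mathcal{L}_2[0,1]$, i.e.

$$W^k_r = \Big\{ f \in \mathcal{L}_2[0,1] \,:\, f = \sum_{j\ge1}\theta_j\phi_j \ \text{ and } \ \sum_{j\ge1} a_j\theta_j^2 \le r \Big\} , \tag{3.3}$$

where

$$\theta_j = (f, \phi_j) = \int_0^1 f(t)\,\phi_j(t)\,dt \tag{3.4}$$

and

$$a_j = \sum_{l=0}^{k}\|\phi_j^{(l)}\|^2 = \sum_{i=0}^{k}(2\pi[j/2])^{2i} . \tag{3.5}$$

Here $(\phi_j)_{j\ge1}$ is the trigonometric basis defined in (2.1).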

Now we describe the conditions on the scale coefficients $(\sigma_j(S))_{j\ge1}$.

H1) $\sigma_j(S) = g(x_j, S)$ for some unknown function $g : [0,1]\times\mathcal{L}_1[0,1] \to \mathbb{R}_+$ which is square integrable with respect to $x$ and such that

$$\lim_{n\to\infty}\ \sup_{S\in W^k_r}\Big| \frac{1}{n}\sum_{j=1}^{n} g^2(x_j, S) - \varsigma(S) \Big| = 0 , \tag{3.6}$$

where $\varsigma(S) := \int_0^1 g^2(x, S)\,dx$. Moreover,

$$g_* = \inf_{0\le x\le1}\ \inf_{S\in W^k_r} g^2(x, S) > 0 \tag{3.7}$$

and

$$\sup_{S\in W^k_r}\varsigma(S) < \infty . \tag{3.8}$$

H2) For any $x \in [0,1]$, the operator $g^2(x,\cdot) : \mathcal{C}[0,1] \to \mathbb{R}$ is differentiable in the Fréchet sense at any fixed function $f_0$ from $\mathcal{C}[0,1]$, i.e. for any $f$ from some vicinity of $f_0$ in $\mathcal{C}[0,1]$,

$$g^2(x, f) = g^2(x, f_0) + L_{x,f_0}(f - f_0) + \Upsilon(x, f_0, f) ,$$

where the Fréchet derivative $L_{x,f_0} : \mathcal{C}[0,1] \to \mathbb{R}$ is a bounded linear operator and the residual term $\Upsilon(x, f_0, f)$, for each $x \in [0,1]$, satisfies the following property:

$$\lim_{\|f - f_0\|_\infty \to 0}\frac{|\Upsilon(x, f_0, f)|}{\|f - f_0\|_\infty} = 0 ,$$

where $\|f\|_\infty = \sup_{0\le t\le1}|f(t)|$.

H3) There exists some positive constant $C^*$ such that for any function $S$ from $\mathcal{C}[0,1]$ the operator $L_{x,S}$ defined in the condition H2) satisfies the following inequality for any function $f$ from $\mathcal{C}[0,1]$:

$$|L_{x,S}(f)| \le C^*\big( |S(x) f(x)| + |f|_1 + \|S\|\,\|f\| \big) , \tag{3.9}$$

where $|f|_1 = \int_0^1 |f(t)|\,dt$.

H4) The function $g_0(\cdot) = g(\cdot, S_0)$ corresponding to $S_0 \equiv 0$ is continuous on the interval $[0,1]$. Moreover,

$$\lim_{\delta\to0}\ \sup_{0\le x\le1}\ \sup_{\|S\|_\infty\le\delta} |g(x, S) - g(x, S_0)| = 0 .$$

Remark 3.1. Let us explain the conditions H1)-H4). In fact, these are regularity conditions on the function $g(x, S)$ generating the scale coefficients $(\sigma_j(S))_{1\le j\le n}$.

Condition H1) means that the function $g(\cdot, S)$ should be uniformly integrable with respect to the first argument in the sense of the convergence (3.6). Moreover, this function should be separated from zero (see inequality (3.7)) and bounded on the class (3.1) (see inequality (3.8)). Boundedness away from zero provides that the distribution of the observations $(y_j)_{1\le j\le n}$ is not degenerate in $\mathbb{R}^n$, and the boundedness means that the intensity of the noise vector should be finite; otherwise the estimation problem makes no sense.

Conditions H2) and H3) mean that the function $g(x,\cdot)$ is regular, at any fixed $0 \le x \le 1$, with respect to $S$ in the sense that it is differentiable in the Fréchet sense (see e.g., Kolmogorov, Fomin, 1989) and, moreover, the Fréchet derivative satisfies the growth condition given by the inequality (3.9), which permits us to consider the example (1.2).

Finally, the condition H4) is the usual uniform continuity condition for the function $g(\cdot,\cdot)$ at the function $S_0$.

Now we give some examples of functions satisfying the conditions H1)-H4).

We set

$$g^2(x, S) = c_0 + c_1 x + c_2 S^2(x) + c_3\int_0^1 S^2(t)\,dt \tag{3.10}$$

with some coefficients $c_0 > 0$, $c_i \ge 0$, $i = 1, 2, 3$.
In this case

$$\varsigma(S) = c_0 + \frac{c_1}{2} + (c_2 + c_3)\int_0^1 S^2(t)\,dt .$$

The Fréchet derivative is given by

$$L_{x,S}(f) = 2c_2\,S(x) f(x) + 2c_3\int_0^1 S(t) f(t)\,dt .$$
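The convergence (3.6) can be observed numerically for the example (3.10). The sketch below is an illustration under stated assumptions: the function names are hypothetical, the value of $\int_0^1 S^2(t)\,dt$ is passed in by the caller rather than computed, and the test function $S(x) = \sin(2\pi x)$ is chosen only because its integral is known in closed form.

```python
import math

def g2(x, S, int_S2, c0, c1, c2, c3):
    """Squared scale function of example (3.10); int_S2 is the integral of S^2 over [0, 1]."""
    return c0 + c1 * x + c2 * S(x) ** 2 + c3 * int_S2

def riemann_gap(S, int_S2, n, c0, c1, c2, c3):
    """Difference between (1/n) * sum_j g^2(x_j, S) and varsigma(S), with x_j = j/n."""
    avg = sum(g2(j / n, S, int_S2, c0, c1, c2, c3) for j in range(1, n + 1)) / n
    varsigma = c0 + c1 / 2.0 + (c2 + c3) * int_S2
    return avg - varsigma

# Usage: for S(x) = sin(2*pi*x) the integral of S^2 over [0, 1] equals 1/2,
# and the gap shrinks like c1 / (2n) as the number of design points grows.
S = lambda x: math.sin(2.0 * math.pi * x)
gap = riemann_gap(S, 0.5, 101, c0=1.0, c1=0.4, c2=0.3, c3=0.2)
```

For this particular $S$ the only discretization error comes from the linear term $c_1 x$, which is why the gap has the simple form $c_1/(2n)$.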

It is easy to see that the function (3.10) satisfies the conditions H1)-H4). Moreover, the conditions H1)-H4) are satisfied by any function of type

$$g^2(x, S) = G(x, S(x)) + \int_0^1 V(S(t))\,dt , \tag{3.11}$$

where the functions $G$ and $V$ satisfy the following conditions:

$G$ is a $[0,1]\times\mathbb{R} \to [c_0, +\infty)$ function (with $c_0 > 0$) such that

$$\lim_{\delta\to0}\ \max_{|u-v|\le\delta}\ \sup_{y\in\mathbb{R}} |G(u, y) - G(v, y)| = 0 \tag{3.12}$$

and

$$\gamma^*_1 = \sup_{0\le x\le1}\ \sup_{y\in\mathbb{R}}\frac{|G'_y(x, y)|}{|y|} < \infty ; \tag{3.13}$$

$V$ is a continuously differentiable $\mathbb{R} \to \mathbb{R}_+$ function such that

$$\gamma^*_2 = \sup_{y\in\mathbb{R}}\frac{|\dot V(y)|}{1 + |y|} < \infty ,$$

where $\dot V(\cdot)$ is the derivative of $V$.
In this case

$$\varsigma(S) = \int_0^1 G(t, S(t))\,dt + \int_0^1 V(S(t))\,dt$$

and

$$\Big| n^{-1}\sum_{j=1}^{n} g^2(x_j, S) - \varsigma(S) \Big| \le \sum_{j=1}^{n}\int_{x_{j-1}}^{x_j} |G(x_j, S(x_j)) - G(t, S(t))|\,dt \le \Delta_n + \sum_{j=1}^{n}\int_{x_{j-1}}^{x_j} |G(t, S(x_j)) - G(t, S(t))|\,dt ,$$

where $\Delta_n = \max_{|u-v|\le1/n}\sup_{y\in\mathbb{R}} |G(u, y) - G(v, y)|$. Now, to estimate the last term in this inequality, note that

$$G(t, S(x_j)) - G(t, S(t)) = \int_t^{x_j} G'_y(t, S(z))\,\dot S(z)\,dz .$$

Therefore, from the condition (3.13) we get

$$|G(t, S(x_j)) - G(t, S(t))| \le \gamma^*_1\int_{x_{j-1}}^{x_j} |S(z)|\,|\dot S(z)|\,dz ,$$

and through the Bunyakovsky-Cauchy-Schwarz inequality, for any $S \in W^k_r$,

$$\Big| n^{-1}\sum_{j=1}^{n} g^2(x_j, S) - \varsigma(S) \Big| \le \Delta_n + \frac{\gamma^*_1}{n}\int_0^1 |S(t)|\,|\dot S(t)|\,dt \le \Delta_n + \frac{\gamma^*_1}{n}\,\|S\|\,\|\dot S\| \le \Delta_n + \frac{\gamma^*_1}{n}\,r .$$

Now, the condition (3.12) implies H1).

Moreover, the Fréchet derivative in this case is given by

$$L_{x,S}(f) = G'_y(x, S(x))\,f(x) + \int_0^1 \dot V(S(t))\,f(t)\,dt .$$

One can check directly that this operator satisfies the inequality (3.9) with $C^* = \gamma^*_1 + \gamma^*_2$.



4 Main results 

Denote by $\mathcal{P}_n$ the family of distributions $p$ in $\mathbb{R}^n$ of the vectors $(\xi_1,\dots,\xi_n)'$ in the model (1.1) such that the components $\xi_j$ are jointly independent, centered with unit variance and

$$\max_{1\le k\le n}\mathbf{E}\,\xi_k^4 \le l^*_n , \tag{4.1}$$

where $l^*_n \ge 3$ is a slowly increasing sequence, that is, it satisfies the property (2.11).

It is easy to see that, for any $n \ge 1$, the centered Gaussian distribution in $\mathbb{R}^n$ with unit covariance matrix belongs to the family $\mathcal{P}_n$. We will denote this Gaussian distribution by $q$.

For any estimator $\hat S_n$ we define the following quadratic risk

$$\mathcal{R}_n(\hat S_n, S) = \sup_{p\in\mathcal{P}_n}\mathbf{E}_{S,p}\,\|\hat S_n - S\|^2_n , \tag{4.2}$$

where $\mathbf{E}_{S,p}$ is the expectation with respect to the distribution $\mathbf{P}_{S,p}$ of the observations $(y_1,\dots,y_n)$ with the fixed function $S$ and the fixed distribution $p \in \mathcal{P}_n$ of the random variables $(\xi_j)_{1\le j\le n}$ in the model (1.1).

Moreover, to make the risk independent of the design points, in this paper we will also make use of the risk with respect to the usual norm (3.2) in $\mathcal{L}_2[0,1]$, i.e.

$$\mathcal{T}_n(\hat S_n, S) = \sup_{p\in\mathcal{P}_n}\mathbf{E}_{S,p}\,\|\hat S_n - S\|^2 . \tag{4.3}$$

If an estimator $\hat S_n$ is defined only at the design points $(x_j)_{1\le j\le n}$, then we extend it as a step function onto the interval $[0,1]$ by setting $\hat S_n(x) = T(\hat S_n)(x)$, for all $0 \le x \le 1$, where

$$T(f)(x) = f(x_1)\,\mathbf{1}_{[0,x_1]}(x) + \sum_{k=2}^{n} f(x_k)\,\mathbf{1}_{(x_{k-1},x_k]}(x) . \tag{4.4}$$
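The step-function extension (4.4) is straightforward to implement. The following minimal sketch assumes the values $f(x_1),\dots,f(x_n)$ at the design points $x_k = k/n$ are given as a list; the name `step_extension` is hypothetical.

```python
from bisect import bisect_left

def step_extension(values):
    """Extend values f(x_1), ..., f(x_n) at x_k = k/n to a step function on [0, 1].

    T(f)(x) = f(x_1) on [0, x_1] and f(x_k) on (x_{k-1}, x_k] for k >= 2.
    """
    n = len(values)
    grid = [k / n for k in range(1, n + 1)]
    def T(x):
        if not 0.0 <= x <= 1.0:
            raise ValueError("x must lie in [0, 1]")
        # Index of the first design point x_k with x <= x_k.
        return values[bisect_left(grid, x)]
    return T

# Usage: three design points 1/3, 2/3, 1 with values 10, 20, 30.
T = step_extension([10.0, 20.0, 30.0])
```

Searching the grid with `bisect_left` instead of computing `ceil(x * n)` avoids floating-point surprises when `x` coincides with a design point.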

In Galtchouk, Pergamenshchikov, 2007, 2008, the following non-asymptotic oracle inequality has been shown for the procedure (2.12).

Theorem 4.1. Assume that in the model (1.1) the function $S$ belongs to $W^1_r$. Then, for any odd $n \ge 3$ and $r > 0$, the estimator $\hat S_*$ satisfies the following oracle inequality

$$\mathcal{R}_n(\hat S_*, S) \le \frac{1 + 3\rho - 2\rho^2}{1 - 3\rho}\ \min_{\lambda\in\Lambda}\mathcal{R}_n(\hat S_\lambda, S) + \frac{1}{n}\,B_n(\rho) , \tag{4.5}$$

where the function $B_n(\rho)$ is such that, for any $\nu > 0$,

$$\lim_{n\to\infty}\frac{B_n(\rho)}{n^{\nu}} = 0 . \tag{4.6}$$

Remark 4.1. Note that in Galtchouk, Pergamenshchikov, 2007, 2008, the oracle inequality is proved for the model (1.1), where the random variables $(\xi_j)_{1\le j\le n}$ are independent identically distributed. In fact, the result and the proof are true for independent random variables which are not identically distributed, i.e. for any distribution of the random vector $(\xi_1,\dots,\xi_n)'$ from $\mathcal{P}_n$.

Now we formulate the main asymptotic results. To this end, for any function $S \in W^k_r$, we set

$$\gamma_k(S) = \Gamma^*_k\,r^{1/(2k+1)}\,(\varsigma(S))^{2k/(2k+1)} , \tag{4.7}$$

where

$$\Gamma^*_k = (2k+1)^{1/(2k+1)}\,\big(k/(\pi(k+1))\big)^{2k/(2k+1)} .$$
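For orientation, the constant (4.7) is easy to evaluate numerically. The sketch below assumes the reading $\gamma_k(S) = \Gamma^*_k\,r^{1/(2k+1)}\,\varsigma(S)^{2k/(2k+1)}$ with $\Gamma^*_k = (2k+1)^{1/(2k+1)}(k/(\pi(k+1)))^{2k/(2k+1)}$; the function name and the argument `varsigma` (the value of $\varsigma(S)$, supplied by the caller) are illustrative.

```python
import math

def pinsker_constant(k, r, varsigma):
    """Evaluate gamma_k(S) from (4.7) for smoothness k, radius r and varsigma = varsigma(S)."""
    e = 1.0 / (2 * k + 1)
    gamma_star = (2 * k + 1) ** e * (k / (math.pi * (k + 1))) ** (2 * k * e)
    return gamma_star * r ** e * varsigma ** (2 * k * e)

# Usage: smoothness k = 2, radius r = 1 and integrated noise variance varsigma(S) = 1.
value = pinsker_constant(2, 1.0, 1.0)
```

The constant grows with the radius $r$ only at the slow rate $r^{1/(2k+1)}$, while its dependence on the integrated variance $\varsigma(S)$ is almost linear for large $k$.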
It is well known (see e.g., Nussbaum, 1985) that the optimal rate of convergence is $n^{2k/(2k+1)}$ when the risk is taken uniformly over $W^k_r$.

Theorem 4.2. Assume that in the model (1.1) the sequence $(\sigma_j(S))$ fulfills the condition H1). Then the estimator $\hat S_*$ from (2.12) satisfies the inequalities

$$\limsup_{n\to\infty}\ n^{2k/(2k+1)}\sup_{S\in W^k_r}\frac{\mathcal{R}_n(\hat S_*, S)}{\gamma_k(S)} \le 1 \tag{4.8}$$

and

$$\limsup_{n\to\infty}\ n^{2k/(2k+1)}\sup_{S\in W^k_r}\frac{\mathcal{T}_n(\hat S_*, S)}{\gamma_k(S)} \le 1 . \tag{4.9}$$

The following result gives the sharp lower bound for the risk (4.2) and shows that $\gamma_k(S)$ is the Pinsker constant.

Theorem 4.3. Assume that in the model (1.1) the sequence $(\sigma_j(S))$ satisfies the conditions H2)-H4). Then the risks (4.2) and (4.3) admit the following asymptotic lower bounds

$$\liminf_{n\to\infty}\ n^{2k/(2k+1)}\inf_{\hat S_n}\sup_{S\in W^k_r}\frac{\mathcal{R}_n(\hat S_n, S)}{\gamma_k(S)} \ge 1 \tag{4.10}$$

and

$$\liminf_{n\to\infty}\ n^{2k/(2k+1)}\inf_{\hat S_n}\sup_{S\in W^k_r}\frac{\mathcal{T}_n(\hat S_n, S)}{\gamma_k(S)} \ge 1 . \tag{4.11}$$

Remark 4.2. Note that in Galtchouk, Pergamenshchikov, 2005, an asymptotically efficient estimator has been constructed and results similar to Theorems 4.2 and 4.3 were claimed for the model (1.1). In fact, the upper bound is true there under some additional condition on the smoothness of the function $S$, i.e. on the parameter $k$. In the cited paper this additional condition is not formulated because of the erroneous inequality (A.6). To avoid using this inequality we modify the estimation procedure by introducing the penalty term $\rho\hat P_n(\lambda)$ in the cost function (2.9). In this way we remove all additional conditions on the smoothness parameter $k$.

Remark 4.3. In fact, to obtain the non-asymptotic oracle inequality (4.5), it is not necessary to make use of equidistant design points and the trigonometric basis. One may take any design points (deterministic or random) and any orthonormal basis satisfying (2.2). But to obtain the property (4.6) one needs to impose some technical conditions (see Galtchouk, Pergamenshchikov, 2008).

Note that the results of Theorem 4.2 and Theorem 4.3 are based on equidistant design points and the trigonometric basis.



5 Upper bound 

In this section we prove Theorem 4.2. To this end we will make use of the oracle inequality (4.5). We have to find an estimator from the family (2.4)-(2.5) for which we can show the upper bound (4.8). We start with the construction of such an estimator. First we put

$$\tilde l_n = \inf\{ i \ge 1 \,:\, i\varepsilon \ge \overline{r}(S) \}\wedge m \quad\text{and}\quad \overline{r}(S) = r/\varsigma(S) , \tag{5.1}$$

where $a \wedge b = \min(a, b)$.

Then we choose an index from the set $\mathcal{A}_\varepsilon$ as

$$\tilde\alpha = (k, \tilde t_n) , \tag{5.2}$$

where $k$ is the parameter of the set $W^k_r$ and $\tilde t_n = \tilde l_n\varepsilon$. Finally, we set

$$\tilde S = \hat S_{\tilde\lambda} \quad\text{and}\quad \tilde\lambda = \lambda_{\tilde\alpha} . \tag{5.3}$$

Now we show the upper bound (4.8) for this estimator.

Theorem 5.1. Assume that the condition H1) holds. Then

$$\limsup_{n\to\infty}\ n^{2k/(2k+1)}\sup_{S\in W^k_r}\frac{\mathcal{R}_n(\tilde S, S)}{\gamma_k(S)} \le 1 . \tag{5.4}$$

Remark 5.1. Note that the estimator $\tilde S$ belongs to the family (2.4)-(2.5), but we cannot use this estimator directly because the parameters $k$, $r$ and $\overline{r}(S)$ are unknown. We can use this upper bound only through the oracle inequality (4.5) proved for the procedure (2.12).

Now Theorem 4.1 and Theorem 5.1 imply the upper bound (4.8). To obtain the upper bound (4.9) we need the following auxiliary result.

Lemma 5.2. For any $0 < \delta < 1$ and any estimate $\hat S_n$ of $S \in W^k_r$,

$$\|\hat S_n - S\|^2_n \ge (1 - \delta)\,\|T_n(\hat S_n) - S\|^2 - (\delta^{-1} - 1)\,r/n^2 ,$$

where the function $T_n(\hat S_n)(\cdot)$ is defined in (4.4).

The proof of this lemma is given in Appendix A.1.

Now inequality (4.8) and this lemma imply the upper bound (4.9). Hence Theorem 4.2 is proved.

6 Lower bound 

In this section we give the main steps of proving the lower bounds (4.10) and (4.11). In general, we follow the same scheme as Nussbaum, 1985. We begin by bounding the minimax risk from below by a Bayesian one constructed on a parametric functional family introduced in Section 6.2 (see (6.9)) and using the prior distribution (6.10). Further, a special modification of the van Trees inequality (see Theorem 6.1) yields a lower bound for the Bayesian risk depending, of course, on the chosen prior distribution. Finally, in Section 6.3 we choose the parameters of the prior distribution (see (6.10)) providing the maximal value of the lower bound for the Bayesian risk. This value coincides with the Pinsker constant, as is shown in Section 9.2.

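6.1 Lower bound for parametric heteroscedastic regression models

Let $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), \mathbf{P}_\vartheta, \vartheta \in \Theta \subseteq \mathbb{R}^l)$ be a statistical model relative to the observations $(y_j)_{1\le j\le n}$ governed by the regression equation

$$y_j = S_\vartheta(x_j) + \sigma_j(\vartheta)\,\xi_j , \quad 1 \le j \le n , \tag{6.1}$$

where $\xi_1,\dots,\xi_n$ are i.i.d. $\mathcal{N}(0,1)$ random variables, $\vartheta = (\vartheta_1,\dots,\vartheta_l)'$ is an unknown parameter vector, $S_\vartheta(x)$ is an unknown (or known) function and $\sigma_j(\vartheta) = g(x_j, S_\vartheta)$, with the function $g(x, S)$ defined in the condition H1). Assume that a prior distribution $\mu_\vartheta$ of the parameter $\vartheta$ in $\mathbb{R}^l$ is defined by the density $\Phi(\cdot)$ of the following form

$$\Phi(z) = \Phi(z_1,\dots,z_l) = \prod_{i=1}^{l}\varphi_i(z_i) ,$$

where $\varphi_i$ is a continuously differentiable bounded density on $\mathbb{R}$ with

$$I_i = \int_{\mathbb{R}}\frac{\dot\varphi_i^2(u)}{\varphi_i(u)}\,du < \infty .$$

Let $\tau(\cdot)$ be a continuously differentiable $\mathbb{R}^l \to \mathbb{R}$ function such that, for any $1 \le i \le l$,

$$\lim_{|z_i|\to\infty}\tau(z)\,\varphi_i(z_i) = 0 \quad\text{and}\quad \int_{\mathbb{R}^l}|\tau'_i(z)|\,\Phi(z)\,dz < \infty , \tag{6.2}$$

where

$$\tau'_i(z) = (\partial/\partial z_i)\,\tau(z) .$$

Let $\hat\tau_n$ be an estimator of $\tau(\vartheta)$ based on the observations $(y_j)_{1\le j\le n}$. For any $\mathcal{B}(\mathbb{R}^n\times\mathbb{R}^l)$-measurable integrable function $G(x, z)$, $x \in \mathbb{R}^n$, $z \in \mathbb{R}^l$, we set

$$\tilde{\mathbf{E}}\,G(Y, \vartheta) = \int_{\mathbb{R}^l}\mathbf{E}_z\,G(Y, z)\,\Phi(z)\,dz ,$$

where $\mathbf{E}_\vartheta$ is the expectation with respect to the distribution $\mathbf{P}_\vartheta$ of the vector $Y = (y_1,\dots,y_n)$. Note that in this case

$$\mathbf{E}_\vartheta\,G(Y, \vartheta) = \int_{\mathbb{R}^n} G(v, \vartheta)\,f(v, \vartheta)\,dv ,$$

where

$$f(v, \vartheta) = \prod_{j=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma_j(\vartheta)}\exp\Big\{ -\frac{(v_j - S_\vartheta(x_j))^2}{2\sigma_j^2(\vartheta)} \Big\} . \tag{6.3}$$

We prove the following result.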

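Theorem 6.1. Assume that the conditions H1)-H2) hold. Moreover, assume that the function $S_z(\cdot)$ with $z = (z_1,\dots,z_l)'$ is differentiable with respect to $z_i$, $1 \le i \le l$, uniformly over $0 \le x \le 1$, i.e. for any $1 \le i \le l$ there exists a function $S'_{z_i} \in \mathcal{C}[0,1]$ such that

$$\lim_{h\to0}\ \max_{0\le x\le1}\Big| S_{z+h e_i}(x) - S_z(x) - S'_{z_i}(x)\,h \Big| / |h| = 0 , \tag{6.4}$$

where $e_i = (0,\dots,1,\dots,0)'$, all coordinates are 0 except the $i$-th, which equals 1. Then for any square integrable estimator $\hat\tau_n$ of $\tau(\vartheta)$ and any $1 \le i \le l$,

$$\tilde{\mathbf{E}}\,(\hat\tau_n - \tau(\vartheta))^2 \ge \frac{\overline{\tau}_i^{\,2}}{F_i + B_i + I_i} , \tag{6.5}$$

where $\overline{\tau}_i = \int_{\mathbb{R}^l}\tau'_i(z)\,\Phi(z)\,dz$,

$$F_i = \sum_{j=1}^{n}\int_{\mathbb{R}^l}\big( S'_{z_i}(x_j)/\sigma_j(z) \big)^2\,\Phi(z)\,dz , \quad B_i = \frac{1}{2}\sum_{j=1}^{n}\int_{\mathbb{R}^l}\frac{\tilde L_i^2(x_j, z)}{\sigma_j^4(z)}\,\Phi(z)\,dz ,$$

$\tilde L_i(x, z) = L_{x,S_z}(S'_{z_i})$, and the operator $L_{x,S}$ is defined in the condition H2).
The proof is given in Appendix A.2.

Remark 6.1. Note that the inequality (6.5) is a modification of the van Trees inequality (see Gill, Levit, 1995) adapted to the model (6.1).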



6.2 Parametric family of kernel functions 

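In this section we define and study a special parametric family of kernel functions which will be used to prove the sharp lower bound (4.10).
Let us begin with kernel functions. We fix $\eta > 0$ and we set

$$\chi_\eta(x) = \eta^{-1}\int_{\mathbb{R}}\mathbf{1}_{(|u|\le1-\eta)}\,V\Big(\frac{u - x}{\eta}\Big)\,du , \tag{6.6}$$

where $\mathbf{1}_A$ is the indicator of a set $A$, and the kernel $V \in \mathcal{C}^\infty(\mathbb{R})$ is such that

$$V(u) = 0 \ \text{ for } \ |u| \ge 1 \quad\text{and}\quad \int_{-1}^{1} V(u)\,du = 1 .$$

It is easy to see that the function $\chi_\eta(x)$ possesses the properties:

$$0 \le \chi_\eta \le 1 ; \quad \chi_\eta(x) = 1 \ \text{ for } \ |x| \le 1 - 2\eta \quad\text{and}\quad \chi_\eta(x) = 0 \ \text{ for } \ |x| \ge 1 .$$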

Moreover, for any c > and u > 

"1 



lim sup 



f{x)dx 



0. 



(6.7) 



We divide the interval $[0,1]$ into $M_n$ equal subintervals of length $2h_n$ and on each of them we construct a kernel-type function which equals zero at the boundary of the subinterval together with all its derivatives. This ensures that the Fourier partial sums with respect to the trigonometric basis in $\mathcal{L}_2[-1,1]$ give a natural parametric approximation to the function on each subinterval.

Let $(e_j)_{j\ge1}$ be the trigonometric basis in $\mathcal{L}_2[-1,1]$, i.e.

$$e_1 = 1/\sqrt{2} , \quad e_j(x) = \mathrm{Tr}_j(\pi[j/2]x) , \quad j \ge 2 , \tag{6.8}$$

where the functions $(\mathrm{Tr}_j)_{j\ge2}$ are defined in (2.1).

Now, for any array $z = \{(z_{m,j})_{1\le m\le M_n,\,1\le j\le N_n}\}$ define the following function

$$S_{z,n}(x) = \sum_{m=1}^{M_n}\sum_{j=1}^{N_n} z_{m,j}\,D_{m,j}(x) , \tag{6.9}$$

where $D_{m,j}(x) = e_j(v_m(x))\,\chi_\eta(v_m(x))$,

$$v_m(x) = (x - \tilde x_m)/h_n , \quad \tilde x_m = 2mh_n \quad\text{and}\quad M_n = [1/(2h_n)] - 1 .$$

We assume that the sequences $(N_n)_{n\ge1}$ and $(h_n)_{n\ge1}$ satisfy the following conditions.

A1) The sequence $N_n \to \infty$ as $n \to \infty$ and for any $\nu > 0$

$$\lim_{n\to\infty} N_n^{\nu}/n = 0 .$$

Moreover, there exist $0 < \delta_1 < 1$ and $\delta_2 > 0$ such that

$$h_n = O(n^{-\delta_1}) \quad\text{and}\quad h_n^{-1} = O(n^{\delta_2}) \quad\text{as } n \to \infty .$$

To define a prior distribution on the family of arrays, we choose the following random array $\vartheta = \{(\vartheta_{m,j})_{1\le m\le M_n,\,1\le j\le N_n}\}$ with

$$\vartheta_{m,j} = t_{m,j}\,\zeta_{m,j} , \tag{6.10}$$

where $(\zeta_{m,j})$ are i.i.d. $\mathcal{N}(0,1)$ random variables and $(t_{m,j})$ are some nonrandom positive coefficients. We make use of Gaussian variables since they possess the minimal Fisher information and therefore maximize the lower bound (6.5). We set

C=.™g,.^*.„- (6.11) 
We assume that the coefficients $(t_{m,j})_{1\le m\le M_n,\,1\le j\le N_n}$ satisfy the following conditions.

A2) There exists a sequence of positive numbers {d^)n>i such that 

E E t,f^'-'^ = , lim = , (6.12) 



^2k- 

m=l j=l 

moreover, for any u > 0, 

lim n'^ exp{— (i„/2} = . 



A3) For some < e < 1 



m=l 7=1 



A4) There exists Cq > such that 



lim — y y .f" =0. 



m=l j=l 

Proposition 6.2. Let the conditions A1)-A2) hold. Then, for any $\nu > 0$ and for any $\delta > 0$,

lim n" max P > ^) = . 

n~»oo 0<«<fc-l V ' / 

Proposition 6.3. Let the conditions A^)-A^). Then, for any u > 0, 

lim n''P{S,^^^W^') =0. 

n— >oo 

Proposition 6.4. Let the conditions A1)-A4) hold. Then, for any $\nu > 0$,
lim n^E||^^^„f (l{5,„^iy^n + ^) 

Proposition 6.5. Let the conditions A1)-A4) hold. Then for any function $g$ satisfying the conditions (3.7) and H4)

lim sup E I g~'^{x, 5^ „) - g~'^{x) | = . 

n^oo o<x<l 

Proofs of Propositions 6.2-6.5 are given in the Appendix.
6.3 Bayes risk

Now we will obtain the lower bound for the Bayesian risk that yields the lower bound (4.11) for the minimax risk.

We make use of the sequence of random functions $(S_{\vartheta,n})_{n\ge1}$ defined in (6.9)-(6.10) with the coefficients $(t_{m,j})$ satisfying the conditions A1)-A4), which will be chosen later.

For any estimator S'„ we introduce now the corresponding Bayes risk 



where the kernel family $(S_{z,n})$ is defined in (6.9), and $\mu_\vartheta$ denotes the distribution of the random array $\vartheta$ defined by (6.10) in $\mathbb{R}^l$ with $l = M_n N_n$.

We recall that $q$ is the centered Gaussian distribution in $\mathbb{R}^n$ with unit covariance matrix.

First of all, we replace the functions and S by their Fourier series with 
respect to the basis 




(6.13) 



i\Vmix)\<l) ■ 




n 



s. 



z,n 



^ from 



m=l j=l 



where 




and 




Moreover, from the definition (16. 9p one gets 




i=l 



It is easy to see that the functions $\tau_{m,j}(\cdot)$ satisfy the condition (6.2) for Gaussian prior densities. In this case (see the definition in (6.5)) we have
where



e,(/)= / e'{v)fiv)dv. 



-1 



Now to obtain a lower bound for the Bayes risk we make use of Theorem 6.1, which implies that



m=l j=l ~^ rn,j ' m,j 



where F^^^ = Dl ix,)^g-\x,,S,^^) and 



1 " 



with S') = L^ ^5 ,^). In the Appendix we show that 



and 



hm sup sup 

n^oo l<m<A/„ l<i<Af„ 



hm 



— F 



sup sup 

n^oo nil l<m<M„ l<j<Af„ 



This means that, for any z/ > and for sufficiently large n, 



sup sup -2/~ ^ 



<! + !/. 



where is defined in (16 .Qp . Therefore, if we denote in (16.151) 



nhg-\x^)t1^^ and ^^.(ry, = -^^^^^^ 



we obtain, for sufficiently large n. 



m=l 



(6.14)

(6.15)



(6.16) 



(6.17) 
In the Appendix we show that



hm sup sup 



^jv(z/i,--->Z/jv) 



(6.18) 



where

$$\Psi_N(y_1, \dots, y_N) = \sum_{j=1}^{N} \frac{y_j}{y_j + 1} .$$

Therefore we can write that, for sufficiently large n, 



2k 1 — U _ 1 V — \ r, 



(6.19) 



m=l 



Obviously, to obtain a "good" lower bound for the risk $\mathcal{E}_n(\hat S_n)$ one needs
to maximize the right-hand side of the inequality (6.19). Hence we choose
the coefficients $(t_{m,j})$ by maximizing the function $\Psi_N$, i.e.

$$\max_{y_1, \dots, y_N} \Psi_N(y_1, \dots, y_N) \quad\text{subject to}\quad \sum_{j=1}^{N} y_j\, j^{2k} \le R .$$

The parameter $R > 0$ will be chosen later to satisfy the condition A3). By
the Lagrange multipliers method it is easy to find that the solution of this
problem is given by

$$y_j^*(R) = a^*(R)\, j^{-k} - 1 \qquad (6.20)$$

with

$$a^*(R) = \frac{1}{\sum_{i=1}^{N} i^{k}} \Big( R + \sum_{i=1}^{N} i^{2k} \Big) \quad\text{and}\quad 1 \le j \le N .$$



To obtain a positive solution in (6.20) we need to impose the following
condition

$$R > N^{k} \sum_{i=1}^{N} i^{k} - \sum_{i=1}^{N} i^{2k} . \qquad (6.21)$$
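The Lagrange solution can be checked numerically. The sketch below assumes the reading $\Psi_N(y) = \sum_j y_j/(y_j+1)$ with the constraint $\sum_j y_j j^{2k} \le R$, and the closed form $y_j^*(R) = a^*(R) j^{-k} - 1$; all function names are hypothetical helpers introduced only for this illustration. It verifies that the constraint is active at the candidate optimum, that the positivity threshold from (6.21) behaves as stated, and it provides a sampler of feasible points against which the optimum can be compared.

```python
import random


def power_sum(N, p):
    """Return 1^p + 2^p + ... + N^p."""
    return sum(i ** p for i in range(1, N + 1))


def r_min(N, k):
    """Right-hand side of (6.21): R must exceed this for y_N^*(R) > 0."""
    return N ** k * power_sum(N, k) - power_sum(N, 2 * k)


def a_star(R, N, k):
    """Lagrange coefficient a^*(R) = (R + sum i^{2k}) / sum i^k."""
    return (R + power_sum(N, 2 * k)) / power_sum(N, k)


def y_star(R, N, k):
    """Candidate maximizer y_j^*(R) = a^*(R) j^{-k} - 1, j = 1..N."""
    a = a_star(R, N, k)
    return [a * j ** (-k) - 1.0 for j in range(1, N + 1)]


def psi(y):
    """Objective Psi_N(y) = sum_j y_j / (y_j + 1) (assumed form)."""
    return sum(v / (v + 1.0) for v in y)


def weighted_sum(y, k):
    """Left-hand side of the constraint: sum_j y_j j^{2k}."""
    return sum(v * j ** (2 * k) for j, v in enumerate(y, start=1))


def random_feasible(R, N, k, rng):
    """Draw a non-negative vector whose weighted sum is at most R."""
    raw = [rng.random() for _ in range(N)]
    scale = R * rng.random() / weighted_sum(raw, k)
    return [scale * v for v in raw]


# Hypothetical small instance: any R above r_min(N, k) would do.
N, k = 6, 2
R = 2.0 * r_min(N, k)
y_opt = y_star(R, N, k)
print(min(y_opt) > 0.0, abs(weighted_sum(y_opt, k) - R) < 1e-9)
```

Since the objective is concave and the constraint is linear, a strictly positive stationary point is a global maximizer, which is why comparing against random feasible vectors is a meaningful sanity check here.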



Moreover, from the condition A3) we obtain that

$$R \le \frac{2^{2k+1} (1-\varepsilon)\, r\, n\, h_n^{2k+1}}{\pi^{2k}\, \hat g_0} := R^* , \qquad (6.22)$$

where

$$\hat g_0 = 2 h_n \sum_{m=1}^{M_n} g_0^2(x_m) .$$



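Note that by the condition H4) the function $g_0(\cdot) = g(\cdot, S_0)$ is continuous on
the interval [0, 1], therefore

$$\lim_{n\to\infty} \hat g_0 = \int_0^1 g^2(x, S_0)\, dx = \varsigma(S_0) \quad\text{with}\quad S_0 \equiv 0 . \qquad (6.23)$$

Now we have to choose the sequence $(h_n)$. Note that if we put in (6.10)

$$t_{m,j} = g_0(x_m)\, \sqrt{y_j^*(R)} \big/ \sqrt{n h_n} , \quad\text{i.e.}\quad \kappa_{m,j}^2 = y_j^*(R) , \qquad (6.24)$$

we can rewrite the inequality (6.19) as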

n^^J^J > , (6.25) 

where 



It is clear that

$$\frac{k^2}{(k+1)^2} \le \liminf \inf \Psi_N^*(R)/N \le \limsup \sup \Psi_N^*(R)/N \le 1 .$$

Therefore to obtain a positive finite asymptotic lower bound in (6.25) we
have to take the parameter $h_n$ as

$$h_n = h_*\, n^{-1/(2k+1)}\, N_n \qquad (6.26)$$

with some positive coefficient $h_*$. Moreover, the conditions (6.21)-(6.22) imply
that, for sufficiently large n,

$$\frac{2^{2k+1} (1-\varepsilon)\, r\, n\, h_n^{2k+1}}{\pi^{2k}\, \hat g_0} > N^{k} \sum_{i=1}^{N} i^{k} - \sum_{i=1}^{N} i^{2k} .$$
Moreover, taking into account that for sufficiently large n

N, 



1 " 

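we obtain the following condition on $h_*$:

$$h_* \ge (v^*)^{1/(2k+1)} , \qquad (6.27)$$

where

$$v^* = \frac{(1+\varepsilon)\, k}{c^* (k+1)(2k+1)} \quad\text{and}\quad c^* = \frac{2^{2k+1} (1-\varepsilon)\, r}{\pi^{2k}\, \varsigma(S_0)} .$$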

To maximize the function $\Psi_N^*(R)$ on the right-hand side of the inequality
(6.25) we take $R = R^*$ defined in (6.22). Therefore we obtain that

$$\liminf_{n\to\infty}\, \inf_{\hat S_n}\, n^{2k/(2k+1)}\, \mathcal{E}_n(\hat S_n) \ge \varsigma(S_0)\, F(h_*)/2 , \qquad (6.28)$$

where

$$F(x) = \frac{1}{x} - \frac{2k+1}{(k+1)^2 \big( c^* (2k+1)\, x^{2k+2} + x \big)} .$$
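Furthermore, taking into account that

$$F'(x) = - \frac{\big( (k+1)\, c^* (2k+1)\, x^{2k+1} - k \big)^2}{(k+1)^2 \big( c^* (2k+1)\, x^{2k+2} + x \big)^2} \le 0 ,$$

we get

$$\max_{h_* \ge (v^*)^{1/(2k+1)}} F(h_*) = F\big( (v^*)^{1/(2k+1)} \big) = \frac{(1+\varepsilon')\, k}{k+1}\, (v^*)^{-1/(2k+1)} ,$$

where

$$\varepsilon' = \frac{\varepsilon}{2k + \varepsilon k + 1} . \qquad (6.29)$$

This means that to obtain in (6.28) the maximal lower bound one has to
take in (6.26)

$$h_* = (v^*)^{1/(2k+1)} . \qquad (6.30)$$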
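The monotonicity argument for $F$ can be sanity-checked numerically. The sketch below assumes the readings $F(x) = 1/x - (2k+1)/\big((k+1)^2 (c^*(2k+1)x^{2k+2} + x)\big)$, $v^* = (1+\varepsilon)k/\big(c^*(k+1)(2k+1)\big)$ and $\varepsilon' = \varepsilon/(2k + \varepsilon k + 1)$; the numerical values of `k`, `c` and `eps` are arbitrary illustrations, not values from the paper. It compares a closed-form derivative with a finite difference and evaluates $F$ at the boundary point $(v^*)^{1/(2k+1)}$.

```python
def F(x, k, c):
    """F(x) = 1/x - (2k+1) / ((k+1)^2 (c (2k+1) x^{2k+2} + x))."""
    return 1.0 / x - (2 * k + 1) / ((k + 1) ** 2 * (c * (2 * k + 1) * x ** (2 * k + 2) + x))


def F_prime_closed(x, k, c):
    """Closed form of F'(x): minus a squared numerator over a squared denominator."""
    num = ((k + 1) * c * (2 * k + 1) * x ** (2 * k + 1) - k) ** 2
    den = (k + 1) ** 2 * (c * (2 * k + 1) * x ** (2 * k + 2) + x) ** 2
    return -num / den


def F_prime_numeric(x, k, c, step=1e-6):
    """Central finite difference used only as an independent check."""
    return (F(x + step, k, c) - F(x - step, k, c)) / (2.0 * step)


def v_star(k, c, eps):
    """Boundary constant v^* = (1 + eps) k / (c (k+1) (2k+1))."""
    return (1.0 + eps) * k / (c * (k + 1) * (2 * k + 1))


def eps_prime(k, eps):
    """eps' = eps / (2k + eps k + 1), as in (6.29)."""
    return eps / (2 * k + eps * k + 1)


# Hypothetical parameter values chosen only for illustration.
k, c, eps = 2, 1.5, 0.1
x0 = v_star(k, c, eps) ** (1.0 / (2 * k + 1))
boundary_value = F(x0, k, c)
closed_form = (1.0 + eps_prime(k, eps)) * k / (k + 1) * v_star(k, c, eps) ** (-1.0 / (2 * k + 1))
print(boundary_value, closed_form)
```

Because the derivative is non-positive everywhere on $(0, \infty)$, the maximum of $F$ over the admissible half-line is attained at its left endpoint, which is exactly the choice (6.30).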

It is important to note that if one defines the prior distribution $\mu_\vartheta$ in the
Bayes risk (6.13) by formulas (6.10), (6.24), (6.26) and (6.30), then the
Bayes risk depends on a parameter $0 < \varepsilon < 1$, i.e. $\mathcal{E}_n = \mathcal{E}_{\varepsilon,n}$.
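Therefore, the inequality (6.28) implies that, for any $0 < \varepsilon < 1$,

$$\liminf_{n\to\infty}\, \inf_{\hat S_n}\, n^{2k/(2k+1)}\, \mathcal{E}_{\varepsilon,n}(\hat S_n) \ge \frac{(1+\varepsilon')(1-\varepsilon)^{1/(2k+1)}}{(1+\varepsilon)^{1/(2k+1)}}\, \gamma_k(S_0) , \qquad (6.31)$$

where the function $\gamma_k(S_0)$ is defined in (4.7) for $S_0 \equiv 0$.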

Now, to complete the definition of the sequence of random functions $(S_{\vartheta,n})$
defined by (6.9) and (6.10), one has to define the sequence $(N_n)$. Let us
recall that we make use of the sequence $(S_{\vartheta,n})$ with the coefficients $(t_{m,j})$
constructed in (6.24) for $R = R^*$ given in (6.22) and for the sequence $(h_n)$ given
by (6.26) and (6.30) for some fixed arbitrary $0 < \varepsilon < 1$.

We will choose the sequence $(N_n)$ to satisfy the conditions A1)-A4). One
can take, for example, = [In^ra] + 1. Then the condition A^) is trivial. 
Moreover, taking into account that in this case 

^2kg^ e n {k + l)(2k + l) " 



we find thanks to the convergence (6.23)

n ^7 = 1 



lim ^7 = 1 . 



Therefore, the solution (6.20), for sufficiently large n, satisfies the following
inequality

max y*{R*)f < 2N^ . 

Now it is easy to see that the condition A2) holds with = ^^/N^ and the 
condition A4) holds for arbitrary < < 1. As to the condition A3), note 
that in view of the definition of j in (I6.24p we get 



N„ 

_ ,, , 7, , n 1 -2'= 

n- m=l j=l " j=l 

2k 



-inn -1 n 

_^ ,2 ■2k _ ^ ^ *( R*\ 



n^O _ q _ _ 



e 



Hence the condition A3). 
7 Estimation of non periodic function



Now we consider the estimation problem for a non-periodic regression function
$S$ in the model (1.1). In this case we estimate the function $S$ on
any interior interval $[a, b]$ of $[0, 1]$, i.e. for $0 < a < b < 1$.

It should be pointed out that at the boundary points $x = 0$ and $x = 1$
one must make use of kernel estimators (see Brua, 2007).

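Let now $\chi$ be an infinitely differentiable $[0, 1] \to \mathbb{R}_+$ function such that
$\chi(x) = 1$ for $a \le x \le b$ and $\chi^{(k)}(0) = \chi^{(k)}(1) = 0$ for all $k \ge 0$, for example,

$$\chi(x) = \frac{1}{\eta} \int_{-\infty}^{\infty} V\Big( \frac{x - z}{\eta} \Big)\, \mathbf{1}_{[a', b']}(z)\, dz ,$$

where $V$ is some kernel function introduced in (6.6),

$$a' = \frac{a}{2} , \quad b' = \frac{b}{2} + \frac{1}{2} \quad\text{and}\quad \eta = \frac{1}{4} \min(a,\, 1 - b) .$$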

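Multiplying the equation (1.1) by the function $\chi(\cdot)$ and simulating an i.i.d.
$\mathcal{N}(0, 1)$ sequence $(\zeta_j)_{1\le j\le n}$, one comes to the estimation problem for the
periodic regression function $\tilde S(x) = S(x)\chi(x)$, i.e.

$$\tilde y_j = \tilde S(x_j) + \tilde\sigma_j(S)\, \tilde\xi_j ,$$

where $\tilde\sigma_j^2(S) = \sigma_j^2(S)\, \chi^2(x_j) + \varepsilon^2$ and $\varepsilon > 0$ is some sufficiently small parameter.

It is easy to see that if the sequence $(\sigma_j(S))_{1\le j\le n}$ satisfies the conditions
H1)-H4), then the sequence $(\tilde\sigma_j(S))_{1\le j\le n}$ satisfies these conditions as well
with

$$\tilde\sigma_j(S) = \tilde g(x_j, S) = \sqrt{g^2(x_j, S)\, \chi^2(x_j) + \varepsilon^2} .$$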
8 Conclusions



In conclusion, this paper completes the investigation
of the estimation problem for the nonparametric regression function in
the heteroscedastic regression model (1.1) in the case of quadratic risk. It
is proved that the adaptive procedure (2.12) satisfies a non-asymptotic
oracle inequality and is asymptotically efficient for estimating a periodic
regression function. Moreover, in Section 7 we have explained how to apply
the procedure to the case of a non-periodic function. As far as we know, the
procedure (2.12) is the only one available for estimating the regression function in the model
(1.1). Let us recall once more the main steps of this investigation. The
procedure (2.12) combines both principal aspects of nonparametric estimation:
non-asymptotic and asymptotic. The non-asymptotic aspect is based
on the model selection procedure with penalization (see e.g., Barron, Birge
and Massart, 1999, or Fourdrinier, Pergamenshchikov, 2007). Our model selection
procedure differs from the commonly used one by a small coefficient in
the penalty term going to zero, which provides the sharp non-asymptotic oracle
inequality. Moreover, the commonly used model selection procedure is based
on least-squares estimators, whereas our procedure uses weighted least-squares
estimators with the weights minimizing the asymptotic quadratic
risk, which provides the asymptotic efficiency as the final result. From a practical
point of view, the procedure (2.12) gives an acceptable accuracy even
for small samples, as shown via simulations by Galtchouk, Pergamenshchikov,
2008.

9 Proofs 

9.1 Proof of Theorem 5.1

To prove the theorem we adapt to the heteroscedastic case the corresponding
proof from Nussbaum, 1985.

First, from (2.4) we obtain that, for any $p \in \mathcal{P}_n$,




(9.1) 
where

n 

^.n{S) = -Y.a',{S^{x,). 



n 
1=1 



Setting now uj = u^a) with the function uj defined in (12.80 . the index a 
defined in dO]), jo = p£^„], ji = P/^„] and 

1=1 

we rewrite (9.1) as follows

^sJS-S\\l= X; (1- A/^L (9-2) 
i=io+i 



with 



n 1 



n -1 

Note that we have decomposed the first term on the right-hand side of (9.1) into
the sum

i=io+i 

This decomposition allows us to show that $\Delta_{1,n}$ is negligible and further to
approximate the first term by a similar term in which the coefficients $\theta_{j,n}$ will
be replaced by the Fourier coefficients $\theta_j$ of the function $S$.

Taking into account the definition of $\omega$ in (2.8) we can bound $\omega$ as

Therefore, by Lemma A.1 we obtain

$$\limsup_{n\to\infty} n^{2k/(2k+1)}\, \Delta_{1,n} = 0 .$$
Let us consider now the next term $\Delta_{2,n}$. We have



Aon I = ^ 



i=l 



< — sup 

0<x<l 



J2 - 1) 



Now by Lemma A.2 and the definition (2.7) we obtain directly the same
property for $\Delta_{2,n}$, i.e.

$$\limsup_{n\to\infty} n^{2k/(2k+1)}\, |\Delta_{2,n}| = 0 .$$

Setting



and applying the well-known inequality

$$(a + b)^2 \le (1 + \delta)\, a^2 + (1 + 1/\delta)\, b^2$$

to the first term on the right-hand side of the inequality (9.2) we obtain that,
for any $\delta > 0$ and for any $p \in \mathcal{P}_n$,



+ Ai,„ + A2,, + (1 + 1/5)A3^„, 

where 

Kn = E - . 

i=io+i 

Taking into account that $k \ge 1$ and that

we can show through Lemma A.3 that

$$\limsup_{n\to\infty} n^{2k/(2k+1)}\, \Delta_{3,n} = 0 .$$



(9.3) 
Therefore, the inequality (9.3) yields

hmsupn2fc+i sup 71^{S , S) /'^k{S) < hmsup sup ^k,n{S)hk{S) 

and to prove (5.4) it suffices to show that

$$\limsup_{n\to\infty}\, \sup_{S \in W_r^k} \gamma_{k,n}(S)/\gamma_k(S) \le 1 . \qquad (9.4)$$

First, it should be noted that the definition (5.1) and the inequalities (3.7)-
(3.8) imply directly

lim sup \t^/f{S) — l| = . 

Moreover, by the definition of {^j)i<_j<n (15.31) . for sufficiently large n, for 
which t„ > f(S) we find 

sup = n-'\Ajj-"'/^''+'^ < n-'\A,r{S)r^''/^^'+'^ . 

j>i (^j) 

Therefore, by the definition of the coefficients (aj)j>i in (13.51) 

limsupnsiTT sup sup 7r^\Aj^r{S)y''/^^''+^\l -X^)ya^ < 1. 
fi^oo sewp j>jo 

Furthermore, in view of the definition (12.71) we calculate directly 



lim sup 



n _ pi 

n-^^ ^ - {Ajj{S))^^ / (1 - z^fdz 

Jn 



0. 



Now, the definition of in (13. 3p and the condition (13. 6p imply the inequality 
(I93D. Hence Theorem O □ 



9.2 Proof of Theorem 4.3

In this section we prove Theorem 4.3. Lemma 5.2 implies that to prove the
lower bounds (4.10) and (4.11), it suffices to show

$$\liminf_{n\to\infty}\, \inf_{\hat S_n}\, n^{2k/(2k+1)}\, \mathcal{R}_0(\hat S_n) \ge 1 , \qquad (9.5)$$
where



7^o(5J = sup EsJS^-Srh,{S) 



For any estimator $\hat S_n$, we denote by $\hat S_n^0$ its projection onto $W_r^k$, i.e.
$\hat S_n^0 = \mathrm{Pr}_{W_r^k}(\hat S_n)$. Since $W_r^k$ is a convex set, we get

$$\|\hat S_n - S\|^2 \ge \|\hat S_n^0 - S\|^2 .$$

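Now we introduce the following set

$$\Xi_n = \Big\{ \max_{1\le m\le M_n}\, \max_{1\le j\le N_n} |\zeta_{m,j}| \le d_n \Big\} , \qquad (9.6)$$

where $(\zeta_{m,j})$ are the i.i.d. $\mathcal{N}(0, 1)$ random variables from (6.10) and the sequence
$(d_n)_{n\ge 1}$ is given in the condition A2). Therefore, we can write that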

^ f q 1 1 ^ n ^ ^ ^ 1 1 

Here the kernel function family {S^ „) is given in (K9h in which A^'^ = [In^ n] +1 
and the parameter h is defined in (16.261) and (16.301) ; the measure /i^ is defined 
in (16.131) . Moreover, note that on the set S the random function 5"^^ is 
uniformly bounded, i.e. 

Il'^^,nlloo= sup |5^,„(x)| <^/d^f, (9.7) 

0<x<l 

where the coefficient $t^*$ is defined in (6.11).

Thus, we estimate the risk 7lo{S^) from below as 

with 

7:= sup ^k{S). (9.8) 

\\s\\o.<V<t: 

By making use of the Bayes risk (6.13) with the prior distribution given by
formulae (6.10), (6.24), (6.26) and (6.30) for any fixed parameter $0 < \varepsilon < 1$,
we rewrite the lower bound for $\mathcal{R}_0(\hat S_n)$ as

7^o(5J>f,,„(5°)/7:-2^]„/7: (9.9) 
with

In Section 6.3 we proved that the parameters of the chosen prior distribution
satisfy the conditions A1)-A4). Therefore Propositions 6.3-6.4 and the limit
(A.4) imply that, for any $\nu > 0$,

hm n'^ = . 

n^oo 

Moreover, by the condition H4) the sequence $\gamma_n^*$ goes to $\gamma_k(S_0)$ as $n \to \infty$.
Therefore, from this, (6.31) and (9.9) we get, for any $0 < \varepsilon < 1$,

hminf inf 7^o(5J > ^ ^'^^^ ~ f ^ . 

where $\varepsilon'$ is defined in (6.29). Letting $\varepsilon \to 0$ here implies the inequality (9.5).
Hence Theorem 4.3. □

10 Appendix 

A.1 Proof of Lemma 5.2

First notice that, for any $S \in W_r^k$, one has

s\\i = wus) - sr + Al^+ Al^, 

where ^ 

^In = 2E r (S^i^j) - S{xmS{x) - S{x^))dx 

j=l "'aij-i 

and ^ 

j=l Xj.l 

For any $0 < \delta < 1$, by making use of the elementary inequality

$$2ab \le \delta a^2 + \delta^{-1} b^2 ,$$

one gets

< 5\\US)-Sr + 5-'Al^. 
Moreover, for any $S \in W_r^k$ with $k \ge 1$, by the Bounyakovskii-Cauchy-Schwarz
inequality we obtain that



2,n - ^2 ^ / W ^2 II II - 



Hence Lemma 5.2. □

A.2 Proof of Theorem 6.1

For any $z = (z_1, \dots, z_l)' \in \mathbb{R}^l$ we set

Note that due to the condition (3.7), the density (6.3) is bounded, i.e.

f{v,z)<{2ng^)--/'. 
So through (6.2) we obtain that

lim r(2;)/(w,z)(^.(z.) = 0. 

Therefore, integrating by parts yields 

E(r„-r(^))f^.= / [rjv) -r{z))^{f{v,zMz))dzdv 

J jjn + i OZ^ 

^ t{z)) ^{z) ( [ fiy, z)dv] dz = Ti. 



I \dz.^ 

Now the Bounyakovskii-Cauchy-Schwarz inequality gives the following lower
bound

E(f„ - r(^))^ > rVEf,^ 
To estimate the denominator in the last ratio, note that 

g^{v,z) = l{v,z) + ^^ With Uv,z) = {d/dz,)\r^f{v,z). 



From (6.1) it follows that
Moreover, the conditions H2) and (6.4) imply

{d/dz,)a]{z) = {d/dz^)g\x^,S:) = L,(x,, 2;) 
from which it follows 

E [Uy,^)) =F, + B,. 
This implies inequality (6.5). Hence Theorem 6.1. □

A.3 Proof of Proposition 6.2

First note that, for $0 \le x \le 1$, we can represent the $l$th derivative as

. M„ I 

'^S(^) = E (9 (A.i) 

m=l i=o 

where 

i=i 

Therefore 

n m=l 1 \j=0 / 

and by the Bounyakovskii-Cauchy-Schwarz inequality we obtain that 
with C*{l,v) = max„i<,<i ((0 4'^'^^))' and 
Now we show that, for any $0 \le i \le k - 1$ and $\delta > 0$,

hm P (Q^{^) > 6hl^-^) = . (A.3) 

n^oo 

To this end note that 

< (-Tyt' fx' . 



Therefore, taking into account the definition of the set $\Xi_n$ in (9.6), the functions
$Q_i(\vartheta)$ with $0 \le i \le k - 1$ can be estimated on this set as



m=l j=l 

and by (6.12) we get, for any $\delta > 0$ and sufficiently large n,

p (g,(^) > < p (s^) . 

Moreover, for sufficiently large n. 

Therefore, the condition A^^) implies 

$$\limsup_{n\to\infty} n^{\nu}\, \mathbf{P}(\Xi_n^c) = 0 \qquad (A.4)$$

for any $\nu > 0$. Hence Proposition 6.2. □

A.4 Proof of Proposition 6.3

First of all we prove that, for $\varepsilon$ from the condition A3),

lim n'^vhsf^J > y/{l - e/4)r) =0. (A.5) 

n—>oo V ' / 
Indeed, putting $l = k$ in (A.1) we can represent the $k$th derivative of $S_{\vartheta,n}$ as
follows

Si%^) = S^x) + S':jx) (A.6) 

with 

. M„ k-i 

m=l i=o 

and 

1 

m=l 

First, note that, we can estimate the norm of S'^^{x) by the same way as in 
the inequality flA.2p . i.e. 

By making use of flA.Sp we obtain that, for any p > and for any 6 > 0, 

hm n''p(\\S[J>5) =0. (A.7) 

n— >oo V ' / 

Let us consider now the last term in (A.6). Taking into account that
$0 \le \chi_\eta(v) \le 1$ we get



II S"' IP =- 

II z,n II f^2k 



1 " r 

m=l ^ 



m=l j=l 

Therefore from the condition A3) we get for sufficiently large n 

M 

/TTX 2fc _ /7r\ 2fc 

ll^gr <(l-e/2)r+(-) EC-(l-^/2)r+(2) 

m=l 

with 

1 ^" ^ 
7 =__\^/2 ■2k^ and ^ = _ . 
U2k-l m,jJ >>m,j "^^^^ SmJ S^j 



/^2fc- 
We show that for any $\nu > 0$ and for any $\delta > 0$

$$\lim_{n\to\infty} n^{\nu}\, \mathbf{P}(|Z_n| > \delta) = 0 . \qquad (A.8)$$

Indeed, by the Chebyshev inequality, for any $\iota > 0$

Pi\Zj>5)< (A.9) 

Note now that according to the Burkholder-Davis-Gundy inequality, for any
$\iota > 1$ there exists a constant $B^*(\iota) > 0$ such that

\m=l 

Moreover, by putting 

C = max max • 

l<m<M„ l<j<N„ 'J 

we can estimate the random variable C as 

AT ^" 
— ^4fc— 2 / ^ m,i ^* ' 

Therefore, by the condition A4), for sufficiently large n, 

\2t 



\ n/ — ^ ' n n ^* 



n n 

< B* U) E (C^ - l)^' MN'+^h''o , 
where ( ~ A/"(0, 1). Now the condition A^^) implies, for sufficiently large n, 

E(Z„)'' < n-^^ ('^0-2). 

Thus, choosing in (A.9)

L > ///(co^i) + 2/eo 

we obtain the limiting equality (A.8) which together with (A.6)-(A.7) implies
(A.5). Now it is easy to deduce that Proposition 6.2 yields Proposition 6.3.
□
A.5 Proof of Proposition 6.4

First of all, we recall that, due to the condition A^), 



hm yyt' .< hm -^yyf .j2('=-i) = o. 

m=l j=l " m=l i=l 

Therefore, taking into account that 

m=l j=l 

we obtain, for sufficiently large n, 

E \\S^,nf (l{5,,„^M/;=} + 1=;;) < max E^^^ (l{s,,„^ty^n + Is^J • 

Moreover, for any 1 < m < M„ and 1 < j < A^„, we estimate the last term 

as 



where $\zeta \sim \mathcal{N}(0, 1)$. By applying now Proposition 6.3 and the limit (A.4) we
come to Proposition 6.4. □

A.6 Proof of Proposition 6.5

Taking into account the inequality (19.71) and the condition H^^) we obtain 

E \g-'^{x,S^J - g^^{x)\ < max \g~^{x, S) - g^^ix)] 

+ iV9.) P (H^) . 

Conditions Ag) and H4) together with the limit relation flA.4p imply 
Proposition 6.5. □
A.7 Properties of the trigonometric basis
Lemma A.1. For any function $S \in W_r^k$,



sup sup m 

n>l l<m<n—l 



,2k 



y 9' 



< 



4r 



(A.ll) 



Lemma A. 2. For any m > 0, 



sup sup N 

N>2 xe[0,l] 



N 



1=2 



< T 



(A.12) 



Proofs of Lemma A.1 and Lemma A.2 are given in Galtchouk, Pergamenshchikov,
2007.

Lemma A.3. Let $\theta_{j,n}$ and $\theta_j$ be the Fourier coefficients defined in (2.3) and
(3.4), respectively. Then, for $1 \le j \le n$ and $n \ge 2$,

$$\sup_{S \in W_r^k} |\theta_{j,n} - \theta_j| \le 2\pi \sqrt{r}\, j / n . \qquad (A.13)$$



Proof. Indeed, we have 



J2 r {S{xi)<P^{xi)~S{x)<P^{x))dx 

J 1 X, ^ 



1=1 ^ -^i-i 



^ E r + \s{z)^,{z)\)dz 



1=1 "'■^i-l 
"1 



n 



\S{z)\\<p^{z)\ + \S{z)\\<p^{z)\)dz. 

By making use of the Bounyakovskii-Cauchy-Schwarz inequality we get

, II + 1101111^1 

< n-' (\\S\\ + vrj ll^ll) . 

The definition of the class $W_r^k$ implies (A.13). Hence Lemma A.3. □
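The discretization bound of Lemma A.3 can be checked on a concrete smooth periodic function. The sketch below makes several assumptions that are not stated in this section: the trigonometric basis is taken as $\phi_1 = 1$, $\phi_j(x) = \sqrt{2}\cos(2\pi [j/2] x)$ for even $j$ and $\sqrt{2}\sin(2\pi [j/2] x)$ for odd $j \ge 3$; the empirical coefficient is read as $\theta_{j,n} = n^{-1}\sum_l S(x_l)\phi_j(x_l)$ with $x_l = l/n$; the bound is read as $2\pi\sqrt{r}\, j/n$; and $r$ is computed for smoothness $k = 1$ as $\|S\|^2 + \|\dot S\|^2$. The test function is an arbitrary illustration.

```python
import math


def phi(j, x):
    """Trigonometric basis function (assumed form)."""
    if j == 1:
        return 1.0
    angle = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(angle) if j % 2 == 0 else math.sin(angle))


def S(x):
    """Illustrative smooth 1-periodic regression function."""
    return math.exp(math.sin(2.0 * math.pi * x))


def dS(x):
    """Derivative of S."""
    return 2.0 * math.pi * math.cos(2.0 * math.pi * x) * S(x)


def grid_mean(f, m):
    """Average of f over the grid l/m, l = 1..m."""
    return sum(f(l / m) for l in range(1, m + 1)) / m


def theta_discrete(j, n):
    """Empirical Fourier coefficient on the design x_l = l/n."""
    return grid_mean(lambda x: S(x) * phi(j, x), n)


def theta_exact(j, fine=4096):
    """Fourier coefficient, approximated on a much finer grid."""
    return grid_mean(lambda x: S(x) * phi(j, x), fine)


# Radius r for smoothness k = 1: squared norms of S and of its derivative.
r = grid_mean(lambda x: S(x) ** 2, 4096) + grid_mean(lambda x: dS(x) ** 2, 4096)
n = 16
gaps = [abs(theta_discrete(j, n) - theta_exact(j)) for j in range(1, n + 1)]
bounds = [2.0 * math.pi * math.sqrt(r) * j / n for j in range(1, n + 1)]
print(max(g / b for g, b in zip(gaps, bounds)))
```

For very smooth functions the actual gap is far below the bound at low frequencies; the bound matters mainly for indices $j$ comparable to $n$, where aliasing on the grid becomes visible.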



l^.„-^.l<-" ll^l 
A.8 Proofs of (6.16) and (6.17)



First of all, note that Proposition 6.5, the condition (3.7) and the condition
H4) imply that

lim max sup l{|„^^^(^)[<i| E |^"^(a;, 5^ ) - ^^^(x^) | = . (A.14) 

n^ool<m<M„ o<2:<l 

Let us show now that for any continuously differentiable function $f$ on $[-1, 1]$



lim sup 

n^oo l<m<Af„ 

Indeed, setting 



1 " 

^ Yl fM^^))^{Mx,:)\<l} - J /(^)d^ 

i=l ^ 



0. (A.15) 



1 " 

1=1 "^-1 



we deduce 



A. 



<E r^^^^ \f{vUx,))-f{z)\dz + max\f{z)\{2-v* + vJ, 

Jv (x ) 1^1=^1 

where = [nx.^ — nh] + 1, = [nx^ + nh], 

= {[nx.^ — nh] + 1 — nx^)/ {nh) and v* = {[nx^ + nh] — nx^)/ {nh) . 

Therefore, taking into account that the derivative of the function / is bounded 
on the interval [—1, 1] we obtain that 



A < 

I n,m| — 



3max|^l<i \f{z) \ + 2max,^,<i \f{z) 



nh^ 



Taking into account the conditions on the sequence $(h_n)_{n\ge 1}$ imposed in A1)-A4) we
obtain the limiting equality (A.15) which together with (A.14) implies (6.16).
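The Riemann-sum approximation behind (A.15) can be illustrated as follows. The sketch assumes that the local variable is $v(x) = (x - c)/h$ for a window centre $c$ and bandwidth $h$, and that the quantity of interest is $(nh)^{-1}\sum_i f(v(x_i))\mathbf{1}\{|v(x_i)| \le 1\}$ on the design $x_i = i/n$; the centre, the bandwidth and the test function are hypothetical choices.

```python
import math


def local_average(f, n, h, center):
    """(1 / (n h)) * sum over design points i/n with |v| <= 1 of f(v), v = (i/n - center) / h."""
    total = 0.0
    for i in range(1, n + 1):
        v = (i / n - center) / h
        if abs(v) <= 1.0:
            total += f(v)
    return total / (n * h)


# Hypothetical window and test function f = cos, whose integral over [-1, 1] is 2 sin(1).
h, center = 0.05, 0.5
target = 2.0 * math.sin(1.0)
errors = {}
for n in (10 ** 3, 10 ** 4, 10 ** 5):
    errors[n] = abs(local_average(math.cos, n, h, center) - target)
print(errors)
```

The error is of order $1/(nh)$, so the approximation is accurate only when $nh \to \infty$, which is the role played by the conditions on the bandwidth sequence.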
Now we study the behavior of $B_{m,j}$. Due to the inequality (3.9) we
estimate the Frechet derivative as

|L„,,(x, V)l < C* {\S^,nix)D^Ax)\ + \D^J, + II VII ll^™,,ll) • (A.16) 

Consider now the first term on the right-hand side of this inequality. We
have



, 1=1 



{x)\<i} ^ y^n) ■'{if™wi<i} ■ 

We recall that the sequence $t^*$ is defined in (6.11). Therefore, property (A.15)
implies

1 " 

max max — E( V(xjD„,^.(a;J)2 = 0((r )2) . 

i=l 

As to the second term on the right-hand side of (A.16), we get



\Dm,j\i= {(^ji^mi^)) XniVmi^))\dx = h \ej{v) Xr,{v)\dv < 2h ^ 
Similarly, \\D^jf < h and, by fOTTOj) 

m=l j=l 



Therefore, 



^ max max |5 .| = 0{{t*J^ + hj 



and the conditions A1)-A4) imply (6.17). □
A.9 Proof of (6.18)

Indeed, by direct calculation it is easy to see that, for any $N \ge 1$ and for
any vector $(y_1, \dots, y_N)' \in \mathbb{R}_+^N$,



• Vn) 



< 



mm 



where the operator $e_j(f)$ is defined in (6.14). Moreover, we recall that
$\int_{-1}^{1} e_j^2(v)\, dv = 1$. Therefore, taking into account the property (6.7) we obtain
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